Dashboards are powerful tools that enable stakeholders to visualize and interpret data, uncovering trends and actionable insights. In this lab, you will use IBM Cognos Analytics or Google Looker to design an interactive dashboard that highlights key performance metrics for HealthFirst Care. This dashboard will serve as a decision-support tool, addressing areas such as patient wait times, resource utilization, and patient satisfaction levels.
import numpy as np
#from numpy import count_nonzero, median, mean
import pandas as pd
from pandas.plotting import scatter_matrix
import matplotlib.pyplot as plt
import seaborn as sns
import random
# Plotly
import plotly.express as px
import plotly.offline as py
import plotly.graph_objs as go
import datetime
from datetime import datetime, timedelta, date
%matplotlib inline
#sets the default autosave frequency in seconds
%autosave 60
sns.set_style('dark')
sns.set(font_scale=1.2)
#sns.set(rc={'figure.figsize':(14,10)})
plt.rc('axes', titlesize=9)
plt.rc('axes', labelsize=14)
plt.rc('xtick', labelsize=12)
plt.rc('ytick', labelsize=12)
import warnings
warnings.filterwarnings('ignore')
pd.set_option('display.max_columns',None)
#pd.set_option('display.max_rows',None)
pd.set_option('display.width', 1000)
pd.set_option('display.float_format','{:.2f}'.format)
random.seed(0)
np.random.seed(0)
np.set_printoptions(suppress=True)
Autosaving every 60 seconds
df1 = pd.read_csv("Cleaned_Appointment.csv", parse_dates=["Date"], dayfirst=True)
df1.head()
| index | AppointmentID | PatientID | DoctorID | Department | Date | Time | Status | |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | A7515 | P16262 | D1 | Orthopedics | 2024-09-10 | 13:00 | No Show |
| 1 | 2 | A7537 | P99601 | D30 | Cardiology | 2024-12-17 | 15:00 | Completed |
| 2 | 3 | A8866 | P55235 | D30 | Neurology | 2024-07-09 | 15:30 | Rescheduled |
| 3 | 4 | A3039 | P93372 | D23 | Pediatric | 2024-08-12 | 18:00 | Cancelled |
| 4 | 5 | A3237 | P84444 | D1 | Pediatric | 2024-03-06 | 17:30 | Rescheduled |
df1.dtypes
index int64 AppointmentID object PatientID object DoctorID object Department object Date datetime64[ns] Time object Status object dtype: object
df2 = pd.read_csv("Cleaned_Feedback.csv")
df2.head()
| index | FeedbackID | PatientID | Department | Feedback Score | Comments | WaitTime | |
|---|---|---|---|---|---|---|---|
| 0 | 0 | F1000 | P99601 | Cardiology | 4.00 | Delayed response | 45.00 |
| 1 | 1 | F1001 | P36565 | Orthopedics | 3.00 | Needs improvement | 45.00 |
| 2 | 3 | F1003 | P67499 | Neurology | 9.00 | Amazing team | 30.00 |
| 3 | 4 | F1004 | P49994 | Pediatric | 8.00 | Great doctors | 45.00 |
| 4 | 5 | F1005 | P31843 | Pediatric | 5.00 | Satisfactory | 45.00 |
df3 = pd.read_csv("Cleaned_Pivot1.csv", parse_dates=["Date"], dayfirst=True)
df3.head()
| PatientID | Department_x | Date | Time | Status | Feedback Score | WaitTime | Satisfaction Level | |
|---|---|---|---|---|---|---|---|---|
| 0 | P16262 | Orthopedics | 2024-09-10 | 13:00 | No Show | 9.00 | 30.00 | High |
| 1 | P99601 | Cardiology | 2024-12-17 | 15:00 | Completed | 4.00 | 45.00 | High |
| 2 | P55235 | Neurology | 2024-07-09 | 15:30 | Rescheduled | 6.00 | 45.00 | High |
| 3 | P93372 | Pediatric | 2024-08-12 | 18:00 | Cancelled | 4.00 | 45.00 | High |
| 4 | P84444 | Pediatric | 2024-03-06 | 17:30 | Rescheduled | 8.00 | 45.00 | High |
df4 = pd.read_csv("Cleaned_Resource.csv", parse_dates=["Date"], dayfirst=True)
df4.head()
| index | ResourceID | Department | ResourceType | UsageHours | Date | Availability | |
|---|---|---|---|---|---|---|---|
| 0 | 0 | R5487 | Cardiology | Nurse | 2 | 2024-07-28 | Available |
| 1 | 1 | R8097 | Orthopedics | Doctor | 0 | 2024-09-10 | Unavailable |
| 2 | 2 | R2557 | General Medicine | Room | 0 | 2024-12-17 | Under Maintenance |
| 3 | 3 | R9024 | Neurology | Doctor | 9 | 2024-07-09 | Available |
| 4 | 5 | R2629 | Oncology | Doctor | 4 | 2024-03-06 | Available |
fig = px.histogram(data_frame=df2, x="Feedback Score", nbins=20, title="Histogram of Feedback Score")
fig.show()
fig = px.histogram(data_frame=df2, x="WaitTime", nbins=5, title="Histogram of Wait Time")
fig.update_traces(marker_color='orange')
fig.update_layout(
xaxis_title="Wait Times",
yaxis_title="Count"
)
fig.show()
fig = px.bar(data_frame=df4, x="Department", y="UsageHours", title="Usage Hours by Dept")
fig.update_traces(marker_color='red')
fig.update_layout(
xaxis_title="Department",
yaxis_title="Usage Hours"
)
# Show the figure
fig.show()
fig = px.bar(data_frame=df4, x="ResourceType", y="UsageHours", title="Usage Hours by Resource Type")
fig.update_traces(marker_color='lime')
fig.update_layout(
xaxis_title="Resource Type",
yaxis_title="Usage Hours"
)
# Show the figure
fig.show()
fig = px.scatter(
data_frame=df2,
x="Feedback Score",
y="WaitTime",
title="Scatterplot")
fig.update_layout(
xaxis_title="Feedback Score",
yaxis_title="Wait Time"
)
fig.show()
# fig = px.scatter(data_frame=df, x="", y="", color="continent", title="",
# size="pop", size_max=10, hover_name="country")
# fig.show()
What is a line plot and why use it?
A line chart or line plot is a type of plot which displays information as a series of data points called 'markers' connected by straight line segments. It is a basic type of chart common in many fields. Use line plot when you have a continuous data set. These are best suited for trend-based visualizations of data over a period of time.
Line plot is a handy tool to display several dependent variables against one independent variable. However, it is recommended that no more than 5-10 lines on a single graph; any more than that and it becomes difficult to interpret.
fig = px.line(data_frame=df3, x="Date", y="WaitTime", title="Wait Time Trends 1", color="Department_x")
fig.update_layout(
xaxis_title="Date",
yaxis_title="Wait Time"
)
fig.show()
fig = px.line(data_frame=df3, x="Time", y="WaitTime", title="Wait Time Trends 2", color="Department_x")
fig.update_layout(
xaxis_title="Time",
yaxis_title="Wait Time"
)
fig.show()
fig = px.scatter(data_frame=df2, x="Feedback Score", y="WaitTime", trendline="ols", title="Trendline")
fig.update_layout(
xaxis_title="Feedback Score",
yaxis_title="Wait Time"
)
fig.show()
# fig = px.scatter_matrix(data_frame=df, title="Heatmap", width=2000, height=2000,
# labels={col:col.replace('_', ' ') for col in df.columns})
# fig.show()
A box plot is a way of statistically representing the distribution of the data through five main dimensions:
minimum and the median.median and maximum.fig = px.box(data_frame=df3, y="Feedback Score", title="Feedback Score Boxplot")
fig.show()
fig = px.box(data_frame=df3, y="WaitTime", title="Wait Time Boxplot")
fig.show()
fig = px.violin(data_frame=df4, y="UsageHours", title="Violinplot for Usage Hours")
fig.show()
# Pivot the DataFrame
heatmap_data = df4.pivot_table(values='UsageHours',
index='Department',
columns='ResourceType',
aggfunc='sum') # Summing usage hours for each resource type per department
# Create the heatmap
fig = px.imshow(heatmap_data,
color_continuous_scale='YlOrRd', # Color scale from yellow to red (low to high usage)
text_auto=True, # Show values inside the cells
labels={'x': 'Resource Type', 'y': 'Department', 'color': 'Usage Hours'},
title="Resource Usage Heatmap by Department")
# Resize the heatmap
fig.update_layout(
width=1000, # Set the width of the heatmap
height=700, # Set the height of the heatmap
)
# Show the figure
fig.show()
# fig = px.area(data_frame=df, x="", y="", color="continent", line_group="country", title="",
# labels={'actual_productivity': 'Actual Productivity'})
# fig.show()
A pie chart is a circular graphic that displays numeric proportions by dividing a circle (or pie) into proportional slices. You are most likely already familiar with pie charts as it is widely used in business and media. We can create pie charts in Matplotlib by passing in the kind=pie keyword.
autopct - is a string or function used to label the wedges with their numeric value. The label will be placed inside the wedge. If it is a format string, the label will be fmt%pct.startangle - rotates the start of the pie chart by angle degrees counterclockwise from the x-axis.shadow - Draws a shadow beneath the pie (to give a 3D feel).df3.columns
Index(['PatientID', 'Department_x', 'Date', 'Time', 'Status', 'Feedback Score', 'WaitTime', 'Satisfaction Level'], dtype='object')
# Group by 'Satisfaction Level' and count the number of patients in each category
satisfaction_counts = df3.groupby('Satisfaction Level')['PatientID'].count().reset_index()
satisfaction_counts
| Satisfaction Level | PatientID | |
|---|---|---|
| 0 | High | 164 |
| 1 | Low | 27 |
# Calculate the percentage of patients in each category
satisfaction_counts['Percentage'] = (satisfaction_counts['PatientID'] / satisfaction_counts['PatientID'].sum()) * 100
satisfaction_counts
| Satisfaction Level | PatientID | Percentage | |
|---|---|---|---|
| 0 | High | 164 | 85.86 |
| 1 | Low | 27 | 14.14 |
# Create a pie chart
fig = px.pie(
satisfaction_counts,
names='Satisfaction Level',
values='Percentage',
title='Patient Satisfaction Levels (High vs Low)',
color='Satisfaction Level', # Color by Satisfaction Level
color_discrete_map={'High': 'green', 'Low': 'red'}, # Custom colors
labels={'Satisfaction Level': 'Satisfaction Level', 'Percentage': 'Percentage of Patients'}
)
fig.update_layout(
width=1000, # Set the width
height=700, # Set the height
)
# Show the figure
fig.show()
# fig = px.choropleth(data_frame=df, locations="iso_alpha", color="lifeExp", hover_name="country",
# animation_frame="year", title="",
# color_continuous_scale=px.colors.sequential.Plasma, projection="natural earth")
# fig.show()